Abstract: In this paper, I apply decision theory and the empirical Bayes
approach to construct confidence intervals for selected populations when the
true parameters follow a mixture prior distribution. A loss function with
two tuning parameters $k_1$ and $k_2$ is introduced to address the mixture prior. One
specific choice of $k_2$ leads to the procedure in Qiu and Hwang (2007);
the other choice of $k_2$ provides an interval construction which controls the
Bayes FCR. Both analytical results and extensive numerical simulation studies
demonstrate that the new empirical Bayes FCR controlling approach
achieves a substantial reduction in interval length. Finally, I apply the different methods to a
microarray data set. The average length of the new approach
is only 57% of that of Qiu and Hwang's procedure, which controls
the simultaneous non-coverage probability, and 66% of that of Benjamini
and Yekutieli (2005)'s procedure, which controls the frequentist FCR.

AMS 2000 subject classifications: Decision Bayes, Loss Function, Simultaneous Intervals.



1. Introduction 

Simultaneous interval estimation for a large number of selected parameters is
challenging, especially when the number of observations for each parameter is
very small. The difficulties are the selection bias (see Qiu and Hwang 2007 and
Hwang 1993) and the multiplicity. The traditional approach, which treats all
the parameters as fixed, seems to have little power when the dimension is
very large, for instance, several thousand in microarray studies. The
empirical Bayes approach, however, is known to borrow strength across the
populations. Thus, it is likely that this method will provide
satisfactory procedures.

In the past, people attempted to estimate the parameters of selected populations
(see, for example, Cohen and Sackrowitz 1982 and Hwang 1993). However,
very few people knew how to construct an interval for a selected population.
The first exciting work was written by Benjamini and Yekutieli (2005) (I will
use B-Y (2005) to represent this work throughout the paper). They adapted
the concept of FDR from multiple testing and coined the concept of the False Coverage
Rate (FCR) for simultaneous intervals. This criterion is much less conservative



*This is an original survey paper 





imsart-ejs ver. 2008/08/29 file: ejs_2009_359.tex date: January 15, 2009 



Z. Zhao/EB Interval 






than the simultaneous non-coverage coefficient. They constructed confidence
intervals for multiple selected parameters which control the FCR at a
specified $q$-level, typically 5%. They centered their intervals at the estimators
$X_i$, which are biased for selected populations, and addressed the multiplicity
by lengthening the intervals. Consequently, their intervals have an extremely large
average half length.

In 2008, Zhao and Hwang introduced the Bayes FCR and connected the Bayes
confidence interval, which aims at controlling the Bayesian non-coverage coefficient,
with Bayes FCR controlling intervals. They applied this general theorem
to the normal-normal setting, where the observations follow a normal distribution
with unequal but known variances and the parameters follow a normal prior.
They used the empirical Bayes approach to derive explicit intervals which
control the empirical Bayes FCR. Their construction reduced the average
length of B-Y's procedure dramatically because they addressed the multiplicity
by modifying the centers instead of the lengths.

Another exciting work is Qiu and Hwang (2007), which offers a way to construct
intervals that control the simultaneous coverage coefficient for selected
populations. In addition to the normal-normal model, they treated the so-called
normal-mixture model, where the prior distribution of the true parameters is
a mixture of a normal random variable with an equal, known variance and a
point mass at zero. Because they addressed the multiplicity by Bonferroni's
correction, their lengths tend to be large when many parameters are selected.

In this paper, I use the decision approach and empirical Bayes to construct
intervals for selected populations under the same model setting as Qiu and
Hwang (2007). The application of the decision approach to interval/set estimation has
a long history which dates back to Faith (1976), Casella and Hwang (1983), and
He (1992). Recently, Hwang, Qiu, and Zhao (2008) constructed the double
shrinkage empirical Bayes confidence interval for a single parameter when
the variances are unequal and unknown. However, none of the loss functions they
used is appropriate under the mixture prior model (a detailed argument
is given in Section 2.2). Thus a new loss function with two tuning parameters $k_1$ and
$k_2$ is proposed. One specific choice of $k_2$ results in
Qiu and Hwang (2007)'s procedure. The other choice of $k_2$ provides a
way to construct empirical Bayes FCR controlling intervals based on the
normal-mixture model.

In Section 2, I introduce the model setting and the decision Bayes rule based
on the new loss function. In Section 3, I first connect the decision Bayes
rule with Qiu and Hwang (2007)'s procedure and then derive a procedure
which can control the Bayes FCR. In Section 4, the empirical Bayes approach
is constructed and evaluated both numerically and analytically. In Section 5,
I apply the confidence intervals constructed in Section 4 to a real microarray
dataset and compare them with B-Y (2005)'s and Qiu and Hwang's procedures.
My procedure outperforms theirs: the average length of my
interval is only 57% of that of Qiu and Hwang's (2007) procedure, which controls
the simultaneous coverage probability, and 66% of that of B-Y (2005)'s procedure,
which controls the frequentist FCR.
2. Normal-Mixture Model for the means
2.1. Model Assumption 

In microarray studies, it is generally assumed that the observed differential expression
levels $X_i$ are normally distributed with true means $\theta_i$, $i = 1, 2, \ldots, p$, where
the dimension $p$ varies from several thousand to over thirty thousand. Due to
the extremely large number of dimensions, it is strongly recommended to use a
prior to model the true means $\theta_i$. A natural choice is the normal prior

\[ \theta_i \overset{iid}{\sim} N(0, \tau^2). \]

However, Qiu and Hwang (2007) applied a Q-Q plot to a
microarray data set and showed that the normal-normal model does not fit the data well.
To remedy this, they introduced the normal-mixture model as follows. Assume
that $X_i \mid \theta_i \sim N(\theta_i, \sigma^2)$, and

\[ \theta_i \begin{cases} = 0 & \text{with probability } \pi_0, \\ \sim N(0, \tau^2) & \text{with probability } \pi_1 = 1 - \pi_0. \end{cases} \tag{1} \]

I use an indicator $I_i$ to describe whether $\theta_i$ is 0, i.e. $I_i = 0$ if $\theta_i = 0$
and $I_i = 1$ if $\theta_i \sim N(0, \tau^2)$. Initially, I assume that the hyperparameters $\tau^2$ and $\pi_0$
are known and derive the corresponding decision Bayes procedure. In Section
4, I estimate them from the data by using consistent estimators and derive an
empirical Bayes procedure.
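
As an illustration of the normal-mixture model, the following minimal Python sketch draws data from it. The function name `simulate_normal_mixture`, the seed handling, and the parameter values in the usage lines are assumptions made for this example only; they do not come from the paper.

```python
import random


def simulate_normal_mixture(p, pi0, tau2, sigma2, seed=0):
    """Draw (theta, x, indicator) lists from the normal-mixture model.

    theta_i = 0 with probability pi0, otherwise theta_i ~ N(0, tau2);
    x_i given theta_i is N(theta_i, sigma2).
    """
    rng = random.Random(seed)
    thetas, xs, indicators = [], [], []
    for _ in range(p):
        is_nonnull = rng.random() >= pi0  # I_i = 1 with probability 1 - pi0
        theta = rng.gauss(0.0, tau2 ** 0.5) if is_nonnull else 0.0
        x = rng.gauss(theta, sigma2 ** 0.5)
        thetas.append(theta)
        xs.append(x)
        indicators.append(1 if is_nonnull else 0)
    return thetas, xs, indicators


# Illustrative (hypothetical) settings: p = 5000, pi0 = 0.8, tau^2 = 4, sigma^2 = 1.
thetas, xs, indicators = simulate_normal_mixture(p=5000, pi0=0.8, tau2=4.0, sigma2=1.0)
null_fraction = indicators.count(0) / len(indicators)
```

The empirical fraction of exact zeros, `null_fraction`, should be close to the chosen `pi0` when `p` is large.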



2.2. Bayes Interval 

Historically, there have been many attempts to apply the decision Bayes approach to
construct confidence sets and intervals. Faith (1976) first introduced a linear loss
function for a confidence set $CI$ of the parameter $\theta$, $L(\theta, CI) = k\,\mathrm{Volume}(CI) - I_{CI}(\theta)$,
where the tuning parameter $k$ was determined by a minimax rule in
Casella and Hwang (1983). He (1992) used $L(\theta_i, CI_i) = k\,\mathrm{Len}(CI_i) - I_{CI_i}(\theta_i)$ as
the loss function for the interval estimator $CI_i$ of the parameter $\theta_i$. Hwang, Qiu,
and Zhao (2008) modified the loss function above to $L(\theta_i, CI_i) = \frac{k}{\sigma}\mathrm{Len}(CI_i) - I_{CI_i}(\theta_i)$
and constructed the double shrinkage confidence interval when the
variances are unequal and unknown. However, none of these loss functions is
appropriate for the normal-mixture model (1). In fact, for any given confidence
interval, one can construct a new interval as the union of the existing
interval and the point zero. This new interval increases the coverage probability
without changing the length. Consequently, the conditional expected loss of
the new construction is always less than or equal to that of the original one.
As a result, the decision Bayes rule suggests that zero should be included in every
interval. In practice, such constructions have no power and are
useless.

In order to avoid this phenomenon, I add extra terms which influence the loss
function only when the point zero is included, and thus define the loss function
as

\[ L(\theta_i, CI_i) = k_1 \mathrm{Len}(CI_i) I(I_i = 1) - I(\theta_i \in CI_i, I_i = 1) + I_{CI_i}(0)\,(k_2 - I(I_i = 0)), \quad 0 < k_2 < 1. \tag{2} \]

The first two terms balance the length and the true coverage when the true
parameter $\theta_i$ is generated from the normal component. The tuning
parameter $k_1$ will be determined later in this section. The last two terms affect
the loss function only when the corresponding interval includes zero. In that
case, if $\theta_i$ is indeed zero, then $k_2 - I(I_i = 0) = k_2 - 1 < 0$, implying that including
zero is rewarded. On the other hand, if $\theta_i$ is not zero, then $k_2 - I(I_i = 0) = k_2$ is positive
and becomes a penalty term. Thus, an appropriate choice of the tuning parameter
$k_2$ determines when zero should be included.

Furthermore, the flexibility of choosing $k_2$ offers constructions under different
settings. For example, under the normal-normal model, the loss
function (2) reduces to He (1992)'s if I set $k_2 = 0$. In Section 3, I apply two
different choices of $k_2$, one of which reproduces Qiu and Hwang (2007)'s
procedure and the other of which provides a construction that controls the
Bayes FCR at the $q$-level.

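Now I have all the pieces to construct the decision Bayes rule, i.e. I want
to construct a Bayes interval $CI_i^B$ that minimizes $E(L(\theta_i, CI_i) \mid X)$ for
any observation $X$ under the normal-mixture model (1) and the loss
function (2).

Theorem 2.1 Let $\pi_i^0(X) = P(\theta_i = 0 \mid X) = P(I_i = 0 \mid X)$ and $\pi_i^1(X) = 1 - \pi_i^0(X)$. Then

\[ E(L(\theta_i, CI_i) \mid X) = \pi_i^1(X) \int_{CI_i} \big(k_1 - \pi(\theta_i \mid X, I_i = 1)\big)\, d\theta_i + I_{CI_i}(0)\,(k_2 - \pi_i^0(X)). \tag{3} \]

The Bayes interval is

\[ CI_i^B = \begin{cases} \{\theta_i : k_1 < \pi(\theta_i \mid X_i, I_i = 1)\} \setminus \{0\} & \text{if } k_2 > \pi_i^0(X), \\ \{\theta_i : k_1 < \pi(\theta_i \mid X_i, I_i = 1)\} \cup \{0\} & \text{if } k_2 \le \pi_i^0(X). \end{cases} \tag{4} \]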

Intuitively, for any given observation $X_i$, if the conditional probability $P(\theta_i = 0 \mid X)$
is small, it is very unlikely that $\theta_i = 0$, and zero should be excluded. On
the other hand, a larger $\pi_i^0(X)$ indicates that zero should be included. Theorem
2.1 tells us that the parameter $k_2$ is the threshold value.

Under model (1), $\pi(\theta_i \mid X_i, I_i = 1) \sim N(MX_i, M\sigma^2)$ where $M = \frac{\tau^2}{\tau^2 + \sigma^2}$, therefore

\[ \{\theta_i : k_1 < \pi(\theta_i \mid X_i, I_i = 1)\} = \{\theta_i : (\theta_i - MX_i)^2 < -M\sigma^2(2\log(k_1\sqrt{2\pi}) + \log(M\sigma^2))\}. \]

As in Section 3 of Hwang, Qiu, and Zhao (2008), one wants to obtain the
traditional normal interval when the non-informative prior is applied, i.e., when
$\tau \to \infty$ and $M \to 1$, one wants the corresponding interval
$\{\theta_i : \frac{(\theta_i - X_i)^2}{\sigma^2} < -(2\log(k_1\sqrt{2\pi}) + \log\sigma^2)\}$
to coincide with the normal interval $(X_i - z_{q/2}\sigma, X_i + z_{q/2}\sigma)$,
where $z_{q/2}$ is the critical value such that $P(|Z| > z_{q/2}) = q$ when $Z$ is a standard
normal random variable. Therefore, the constant $k_1$ should be chosen such that

\[ z_{q/2}^2 = -(2\log(k_1\sqrt{2\pi}) + \log\sigma^2). \]

Plugging this constant $k_1$ back into the Bayes interval (4), the decision Bayes interval becomes

\[ CI_i^B = \begin{cases} \{\theta_i : (\theta_i - MX_i)^2 < M\sigma^2(z_{q/2}^2 - \log M)\} \setminus \{0\} & \text{if } k_2 > \pi_i^0(X), \\ \{\theta_i : (\theta_i - MX_i)^2 < M\sigma^2(z_{q/2}^2 - \log M)\} \cup \{0\} & \text{if } k_2 \le \pi_i^0(X). \end{cases} \tag{5} \]



Unlike the interval $MX_i \pm \sqrt{M}\sigma z_{q/2}$, which is directly derived from the posterior
distribution, the major part of (5) has an extra positive term $M\sigma^2(-\log M)$,
which is necessary to boost the coverage probability when the hyperparameters
are estimated from the data in Section 4. In the next section, I will choose
the value of the parameter $k_2$ under two different problem settings and derive
the decision Bayes interval accordingly.
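
The decision Bayes interval (5) can be sketched in a few lines of Python. This is only an illustration under the assumption that $\pi_0$, $\tau^2$, $\sigma^2$ and $k_2$ are known; the function names, the overflow guard, and the numerical values in the usage lines are hypothetical. The posterior null probability is obtained from Bayes' rule applied to the two mixture components.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def posterior_null_prob(x, pi0, tau2, sigma2):
    """P(theta_i = 0 | X_i = x) under the normal-mixture model."""
    m = tau2 / (tau2 + sigma2)
    # Density ratio N(0, sigma2 + tau2) / N(0, sigma2) at x; exponent capped to avoid overflow.
    ratio = sqrt(1.0 - m) * exp(min(m * x * x / (2.0 * sigma2), 700.0))
    return pi0 / (pi0 + (1.0 - pi0) * ratio)


def decision_bayes_interval(x, pi0, tau2, sigma2, k2, q=0.05):
    """Return (lower, upper, include_zero) for the interval in (5).

    The interval is (lower, upper) with the point 0 removed when
    include_zero is False and added when include_zero is True.
    """
    m = tau2 / (tau2 + sigma2)
    z = NormalDist().inv_cdf(1.0 - q / 2.0)         # z_{q/2}
    half = sqrt(m * sigma2 * (z * z - log(m)))      # includes the -log(M) correction
    include_zero = k2 <= posterior_null_prob(x, pi0, tau2, sigma2)
    return m * x - half, m * x + half, include_zero


# Hypothetical values: a large observation makes theta_i = 0 implausible.
lower, upper, include_zero = decision_bayes_interval(
    x=5.0, pi0=0.8, tau2=4.0, sigma2=1.0, k2=0.5
)
```

For an observation near zero the posterior null probability exceeds `pi0`, so with the same `k2` the point zero would be added to the interval.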

3. Choose $k_2$

3.1. Qiu and Hwang (2007) 

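Qiu and Hwang (2007) constructed the interval for $K$ selected populations
$X_{(p-K+1)}, X_{(p-K+2)}, \ldots, X_{(p)}$ under the model (1), where the observations
$X_{(j)}$ satisfy

\[ |X_{(1)}| \le |X_{(2)}| \le \cdots \le |X_{(p)}|. \]

Assume $\theta_{(j)}$ is the true parameter corresponding to the observation $X_{(j)}$.
Note that $|\theta_{(p)}|$ is not necessarily equal to $\max_j |\theta_{(j)}|$. I construct the interval
for $\theta_{(j)}$, where $p - K + 1 \le j \le p$, as

\[ CI_{(j)} = \begin{cases} \{\theta_{(j)} : (\theta_{(j)} - MX_{(j)})^2 < M\sigma^2(z_{q/2K}^2 - \log M)\} \setminus \{0\} & \text{if } k_2 > \pi_{(j)}^0(X), \\ \{\theta_{(j)} : (\theta_{(j)} - MX_{(j)})^2 < M\sigma^2(z_{q/2K}^2 - \log M)\} \cup \{0\} & \text{if } k_2 \le \pi_{(j)}^0(X). \end{cases} \tag{6} \]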



When compared with (5), the major difference is that I use the critical value
$z_{q/2K}$ to address the multiplicity. This is known as Bonferroni's correction.
Direct calculation shows that for each $j$,

\[ P(\theta_{(j)} \notin CI_{(j)} \mid X) \le q/K + \pi_{(j)}^0(X)\big(I(\pi_{(j)}^0(X) < k_2) - q/K\big). \]

Consequently, the simultaneous non-coverage coefficient satisfies

\[ P(\theta_{(j)} \notin CI_{(j)} \text{ for some } j = p-K+1, \ldots, p \mid X) \le q + \sum_{j=p-K+1}^{p} \pi_{(j)}^0(X)\big(I(\pi_{(j)}^0(X) < k_2) - q/K\big). \tag{7} \]

If $k_2$ is chosen to be the maximum $k$ such that the expectation of the summation above is
non-positive, i.e.

\[ k_2 = \max\Big\{k : E\sum_{j=p-K+1}^{p} \pi_{(j)}^0(X)\big(I(\pi_{(j)}^0(X) < k) - q/K\big) \le 0\Big\}, \tag{8} \]

then the non-coverage coefficient $P(\theta_{(j)} \notin CI_{(j)} \text{ for some } j = p-K+1, \ldots, p)$ is
controlled at the $q$-level. This choice of $k_2$ is exactly the same as in
Theorem 4 of Qiu and Hwang (2007). Therefore, Qiu and Hwang (2007)'s Bayes
procedure is exactly the same as (6).

3.2. Bayes FCR Controlling Interval 

Benjamini and Yekutieli (2005) initiated the concept of FCR, which is much less
conservative than the simultaneous non-coverage coefficient. Zhao and Hwang
(2008) extended this idea to the Bayesian framework through a new concept,
the Bayes FCR. They showed that there is a natural connection between
the Bayes FCR and the Bayes confidence interval. In this subsection, I will show
that (5) can control the Bayes FCR at the $q$-level if $k_2$ is chosen appropriately.

Theorem 3.1 Assume that $\mathcal{R}(X)$ is the index set of observations that are selected
for interval estimation and $R = \#\mathcal{R}$. Define

\[ f(p, \tau^2, \pi_0, k) = E\Big(\frac{\sum_{i \in \mathcal{R}} \pi_i^0(X)\big(I(\pi_i^0(X) < k) - q\big)}{R}\, I(R > 0)\Big), \]

and $k_2 = \max\{k : f(p, \tau^2, \pi_0, k) \le 0\}$. Then the intervals (5) satisfy

\[ FCR_\pi \le qP(R > 0). \]

In other words, the Bayes FCR of the intervals (5) is controlled at the $q$-level.

Now assume that the selection rule in Qiu and Hwang (2007) is applied, i.e.,
the last $K$ observations after ordering $X_1, X_2, \ldots, X_p$ increasingly according to their
absolute values are selected. Then

\[ f(p, \tau^2, \pi_0, k) = \frac{1}{K} E\Big(\sum_{j=p-K+1}^{p} \pi_{(j)}^0(X)\big(I(\pi_{(j)}^0(X) < k) - q\big)\Big). \]

Comparing it with the expectation in (8), $f$ is always smaller when $K > 1$,
which implies that the choice of $k_2$ for the Bayes FCR controlling interval is always
larger than the choice in Qiu and Hwang's procedure. Consequently, under the same
setting, (5) includes zero less often than Qiu and Hwang's interval.
Furthermore, since they addressed the multiplicity by Bonferroni's correction,
their half length $\sqrt{M\sigma^2(z_{q/2K}^2 - \log M)}$ is much larger than the half length of the
Bayes FCR controlling interval (5), and the discrepancy grows with $K$.
These two facts imply that the Bayes FCR controlling interval is less
conservative than Qiu and Hwang (2007)'s.
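
To see how the Bonferroni-corrected half length grows with the number of selected parameters, the following Python sketch evaluates the half lengths $\sqrt{M\sigma^2(z^2 - \log M)}$ with $z = z_{q/2K}$ and $z = z_{q/2}$. The values of $M$, $\sigma^2$, $q$ and $K$ are assumptions chosen only for illustration, and the helper name `half_length` is hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def half_length(m, sigma2, tail_prob):
    """Half length sqrt(M * sigma^2 * (z^2 - log M)) with upper-tail critical value z."""
    z = NormalDist().inv_cdf(1.0 - tail_prob)
    return sqrt(m * sigma2 * (z * z - log(m)))


q, m, sigma2 = 0.05, 0.8, 1.0                  # illustrative values only
fcr_half = half_length(m, sigma2, q / 2)       # uses z_{q/2}, as in (5)
ratios = {}
for K in (1, 10, 100, 1000):
    bonferroni_half = half_length(m, sigma2, q / (2 * K))  # uses z_{q/2K}, as in (6)
    ratios[K] = fcr_half / bonferroni_half
    print(K, round(bonferroni_half, 3), round(fcr_half, 3), round(ratios[K], 3))
```

The ratio equals one when `K = 1` and decreases as `K` grows, which is the length effect described above; the frequency with which zero is included is a separate effect that this sketch does not cover.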

Another advantage of this theorem is that it holds for any selection rule,
whether pre-determined or data-driven. For example, when
observations are selected according to Benjamini and Hochberg (1995)'s procedure,
which controls the False Discovery Rate at the $q$-level, and $k_2$ is simulated
accordingly, the above theorem still guarantees that (5) controls the Bayes FCR at
the $q$-level.

The choice of $k_2$ depends on the unknown expectation, which prevents us
from finding $k_2$ explicitly. However, $k_2$ can be easily determined by simulation
once the hyperparameters are known.
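
A minimal Monte Carlo sketch of this simulation step is given below. It assumes the top-$K$ selection rule, known hyperparameters, and a finite grid of candidate values for $k$; the function names, the grid, the number of replications, and the parameter values are all assumptions of this example rather than the settings used in the paper.

```python
import random
from math import exp, sqrt


def posterior_null_prob(x, pi0, tau2, sigma2):
    """P(theta_i = 0 | X_i = x) under the normal-mixture model."""
    m = tau2 / (tau2 + sigma2)
    ratio = sqrt(1.0 - m) * exp(min(m * x * x / (2.0 * sigma2), 700.0))
    return pi0 / (pi0 + (1.0 - pi0) * ratio)


def simulate_k2(p, K, pi0, tau2, sigma2, q=0.05, n_rep=100, grid_size=100, seed=0):
    """Estimate f(k) on a grid by Monte Carlo and return the largest k with f(k) <= 0."""
    rng = random.Random(seed)
    grid = [j / grid_size for j in range(grid_size + 1)]
    f_hat = [0.0] * len(grid)
    for _ in range(n_rep):
        xs = []
        for _ in range(p):
            theta = 0.0 if rng.random() < pi0 else rng.gauss(0.0, sqrt(tau2))
            xs.append(rng.gauss(theta, sqrt(sigma2)))
        selected = sorted(xs, key=abs)[-K:]            # top K observations in absolute value
        probs = [posterior_null_prob(x, pi0, tau2, sigma2) for x in selected]
        for j, k in enumerate(grid):
            f_hat[j] += sum(pr * ((1.0 if pr < k else 0.0) - q) for pr in probs) / K
    f_hat = [v / n_rep for v in f_hat]
    feasible = [k for k, v in zip(grid, f_hat) if v <= 0.0]
    return max(feasible), grid, f_hat


# Hypothetical settings, kept small so the sketch runs quickly.
k2, grid, f_hat = simulate_k2(p=500, K=50, pi0=0.8, tau2=4.0, sigma2=1.0)
```

Because the indicator $I(\pi_i^0(X) < k)$ is nondecreasing in $k$, the estimated $f$ is nondecreasing on the grid, and $k = 0$ is always feasible, so the maximum is well defined.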
4. Empirical Bayes Approach

In this section, I estimate the unknown hyperparameters from the data and obtain
a practical confidence interval. The goal is to construct confidence intervals
for selected parameters such that the Bayes FCR is always controlled
for a class of prior distributions determined by the hyperparameters
$\pi_0$ and $\tau^2$. Following Zhao and Hwang (2008), this approach is called
empirical Bayes FCR controlling intervals.

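Recall the model (1). Then $EX_i^2 = \sigma^2 + \pi_1\tau^2$ and $EX_i^4 = 3(\sigma^4 + 2\pi_1\sigma^2\tau^2 + \pi_1\tau^4)$.
By using the method of moments with the sample moments $m_2 = \sum X_i^2/p$ and
$m_4 = \sum X_i^4/p$, one obtains reliable estimators of $\pi_0$
and $\tau^2$ when $p$ is sufficiently large,

\[ \hat\tau^2 = \frac{m_4/3 - \sigma^4}{m_2 - \sigma^2} - 2\sigma^2, \qquad \hat\pi_1 = 1 - \hat\pi_0 = \frac{m_2 - \sigma^2}{\hat\tau^2}. \tag{9} \]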
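
The moment equations above can be solved numerically as in the following Python sketch. The function name `moment_estimates`, the choice to return `None` when the moment equations have no admissible solution, the clipping of $\hat\pi_1$ to $[0, 1]$, and the simulated data are assumptions of this illustration.

```python
import random


def moment_estimates(xs, sigma2):
    """Method-of-moments estimates (pi0_hat, tau2_hat) from E X^2 and E X^4.

    Returns None when the sample moments do not give positive estimates.
    """
    p = len(xs)
    m2 = sum(x ** 2 for x in xs) / p
    m4 = sum(x ** 4 for x in xs) / p
    a = m2 - sigma2                                   # estimates pi1 * tau^2
    b = m4 / 3.0 - sigma2 ** 2 - 2.0 * sigma2 * a     # estimates pi1 * tau^4
    if a <= 0.0 or b <= 0.0:
        return None
    tau2_hat = b / a
    pi1_hat = min(1.0, a / tau2_hat)                  # clipped to [0, 1] (a choice made here)
    return 1.0 - pi1_hat, tau2_hat


# Simulated check with hypothetical true values pi0 = 0.5, tau^2 = 4, sigma^2 = 1.
rng = random.Random(1)
data = []
for _ in range(100000):
    theta = 0.0 if rng.random() < 0.5 else rng.gauss(0.0, 2.0)
    data.append(rng.gauss(theta, 1.0))
pi0_hat, tau2_hat = moment_estimates(data, sigma2=1.0)
```

Since the fourth sample moment is highly variable, a large `p` is needed before these estimates settle near the true values.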

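Plug these two estimators into the function $f$ and simulate the value of
$k_2$, which is denoted by $\hat k_2$. Let $\hat M$ and $\hat\pi_i^0(X)$ be the estimators of
$M$ and $\pi_i^0(X)$ when $\pi_0$ and $\tau^2$ are replaced by (9). Then I can construct the
empirical Bayes interval as

\[ CI_i^{EB} = \begin{cases} \{\theta_i : (\theta_i - \hat MX_i)^2 < \hat M\sigma^2(z_{q/2}^2 - \log \hat M)\} \setminus \{0\} & \text{if } \hat k_2 > \hat\pi_i^0(X), \\ \{\theta_i : (\theta_i - \hat MX_i)^2 < \hat M\sigma^2(z_{q/2}^2 - \log \hat M)\} \cup \{0\} & \text{if } \hat k_2 \le \hat\pi_i^0(X). \end{cases} \tag{10} \]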

The following theorem describes the asymptotic property of the construction. 

Theorem 4.1 For any $0 < \pi_0 < 1$ and $\tau^2 > 0$, assume that for every $\epsilon > 0$ there exist
$\delta, N > 0$ such that for all $p > N$ and $k, k' > 0$,
$(\tau'^2 - \tau^2)^2 + (\pi_0' - \pi_0)^2 + (k' - k)^2 < \delta$ implies

\[ |f(p, \tau'^2, \pi_0', k') - f(p, \tau^2, \pi_0, k)| < \epsilon. \tag{11} \]

Then under the model (1), the empirical Bayes interval (10) satisfies

\[ \limsup_{p \to \infty} FCR_\pi \le q. \]

Proposition 4.1 If the selected parameters are the first $R$ parameters and $R \to \infty$
when $p \to \infty$, then $f$ satisfies the condition (11).

This proposition implies that when all observations are selected for interval
estimation, (10) can control the empirical Bayes FCR asymptotically.

However, like all other existing constructions such as Casella and Hwang
(1983), Qiu and Hwang (2007), and Hwang, Qiu, and Zhao (2008), the interval
(10) cannot automatically provide a satisfactory answer for the finite sample
case.

In Figure 1, I plot the Bayes FCR of the empirical Bayes
interval and of B-Y's procedure under different settings of the hyperparameters
$(\pi_0, \tau^2)$ when $p = 1000$ and only the top 100 observations are selected for
interval estimation. B-Y's procedure always controls the FCR at the 5%
level; however, it is far too conservative, with an extremely
low Bayes FCR when $M$ is close to 1 and a large average length. The green line,
corresponding to the construction (10), performs well when $\tau^2$ is relatively large;
however, some modification is required when $\tau^2$ is small.

Qiu and Hwang (2007) have argued that $\pi_0$ is nearly unidentifiable when
$\tau$ is small. This causes the estimators (9) to be very inaccurate. Therefore,
they mixed their empirical Bayes intervals with the Bonferroni-corrected interval
$(X_i - z_{q/2p}\sigma, X_i + z_{q/2p}\sigma)$ based on a threshold, $\min(\sqrt{720/p}, 0.6)$, obtained from
extensive numerical calculations. It also seems necessary to mix the procedure
(10) with the interval $(X_i - z_{Rq/2p}\sigma, X_i + z_{Rq/2p}\sigma)$, which is inspired by B-Y
(2005). The following analytic argument helps find the threshold value
much more easily than in Qiu and Hwang (2007).

Recall that $EX_i^2 = \sigma^2 + \pi_1\tau^2$ and $EX_i^4 = 3(\sigma^4 + 2\pi_1\sigma^2\tau^2 + \pi_1\tau^4)$, therefore

\[ \tau^2 + 2\sigma^2 = \frac{EX_i^4/3 - \sigma^4}{EX_i^2 - \sigma^2}. \]

Use $m_2 = \sum X_i^2/p$ and $m_4 = \sum X_i^4/p$ to denote the
second and fourth sample moments, then $\hat\tau^2 + 2\sigma^2 = \frac{m_4/3 - \sigma^4}{m_2 - \sigma^2}$.

Since the left hand side is always greater than or equal to $2\sigma^2$, $\tau^2$ is not estimable
when the right hand side is less than $2\sigma^2$. Therefore, I choose a
threshold $\tau_0^2$ such that the probability that the right hand side is smaller than $2\sigma^2$,
i.e. the probability that $\pi_0$ and $\tau^2$ are not estimable, is controlled at the level
$q$. That is, the threshold value $\tau_0^2$ is set to satisfy
$P_{\tau^2 = \tau_0^2}\big(\frac{m_4/3 - \sigma^4}{m_2 - \sigma^2} < 2\sigma^2\big) \le q$.

Now consider the special case $\pi_1 = 1$ and calculate $\tau_0^2$. Use $m_2'$ and $m_4'$
to denote the second and fourth sample moments of $p$ standard normal
observations. Then $m_4 = (\tau^2 + \sigma^2)^2 m_4'$ and $m_2 = (\tau^2 + \sigma^2) m_2'$.
I choose $\tau_0^2$ such that

\[ P_{\tau^2 = \tau_0^2}\big((\tau^2 + \sigma^2)^2 m_4'/3 - 2\sigma^2(\tau^2 + \sigma^2) m_2' + \sigma^4 < 0\big) \le q \]

by simulation.
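
One way such a simulation could be organized is sketched below in Python. The grid of candidate values for $\tau^2$, the number of replications, the rule of returning the first grid value whose estimated probability is at most $q$, and the function name `simulate_tau0_sq` are all assumptions of this sketch; the paper does not specify these details.

```python
import random


def simulate_tau0_sq(p, sigma2, q=0.05, n_rep=200, grid=None, seed=0):
    """Smallest grid value tau^2 whose estimated non-estimability probability is <= q.

    Uses the special case pi1 = 1, where the sample moments are scaled
    moments of p standard normal draws.
    """
    rng = random.Random(seed)
    if grid is None:
        grid = [0.05 * j for j in range(1, 61)]       # candidate tau^2 values up to 3
    moments = []
    for _ in range(n_rep):
        zs = [rng.gauss(0.0, 1.0) for _ in range(p)]
        m2 = sum(z * z for z in zs) / p
        m4 = sum(z ** 4 for z in zs) / p
        moments.append((m2, m4))
    for tau2 in grid:
        s = tau2 + sigma2
        # Event that tau^2 is not estimable: s^2 m4'/3 - 2 sigma^2 s m2' + sigma^4 < 0.
        fails = sum(
            1 for m2, m4 in moments
            if s * s * m4 / 3.0 - 2.0 * sigma2 * s * m2 + sigma2 ** 2 < 0.0
        )
        if fails / n_rep <= q:
            return tau2
    return None


tau0_sq = simulate_tau0_sq(p=500, sigma2=1.0)
```

The same standard normal draws are reused for every candidate value, which keeps the estimated probabilities comparable across the grid.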

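Based on the cutoff, the final empirical Bayes FCR controlling interval with
mixture is defined as

\[ CI_i^{*} = \begin{cases} X_i \pm z_{Rq/(2p)}\sigma & \text{if } m_2 - \sigma^2 < \tau_0^2, \\ CI_i^{EB} & \text{if } m_2 - \sigma^2 \ge \tau_0^2. \end{cases} \tag{12} \]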

In Figure 1, the red solid line corresponds to the above empirical Bayes intervals.
They perform the same as B-Y's when $\tau^2$ is very small because of the mixed
procedure. The portion of the mixture increases when $\pi_0$ increases. However,
(12) performs better than B-Y's when $\tau^2$ is larger. The discrepancy is significant
when $M$ is close to 1.

I have also plotted the simulated average length in Figure 2, which corresponds
to the same model settings as in Figure 1. The average length of (12) is uniformly
less than or equal to the average length of B-Y's procedure. The ratio can even be
as small as 56%.

In Figures 3 and 4, I repeat the simulation setting but change the selection
rule to Benjamini and Hochberg (1995)'s procedure, which aims at finding significant
observations while controlling the False Discovery Rate at the 5% level. The
intervals (12) can control the empirical Bayes FCR at the 5% level based on this
data-driven selection. Compared with B-Y (2005)'s procedure, the improvement
in average length is even more significant than under the
fixed selection rule. The ratio can be as small as 43%.
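
For readers unfamiliar with this data-driven selection rule, the following Python sketch implements the Benjamini and Hochberg step-up procedure together with the half width $z_{Rq/(2p)}\sigma$ of the B-Y style interval. It assumes two-sided tests of $\theta_i = 0$ based on $X_i/\sigma$, which may differ from the hypotheses tested in the simulations; the function names and the toy data are hypothetical.

```python
from statistics import NormalDist


def benjamini_hochberg_select(xs, sigma, q=0.05):
    """Indices selected by the BH step-up rule for two-sided tests of theta_i = 0."""
    nd = NormalDist()
    p = len(xs)
    pvals = [2.0 * (1.0 - nd.cdf(abs(x) / sigma)) for x in xs]
    order = sorted(range(p), key=lambda i: pvals[i])
    r = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / p:
            r = rank                      # largest rank meeting the step-up condition
    return sorted(order[:r])


def by_half_width(R, p, sigma, q=0.05):
    """Half width z_{Rq/(2p)} * sigma of the B-Y style interval for R selected parameters."""
    return NormalDist().inv_cdf(1.0 - R * q / (2.0 * p)) * sigma


toy = [0.1, -0.2, 5.0, 0.3, -6.0, 0.0]   # toy data, for illustration only
selected = benjamini_hochberg_select(toy, sigma=1.0)
width = by_half_width(R=len(selected), p=len(toy), sigma=1.0) if selected else None
```

In this toy example only the two large observations are selected, and the resulting B-Y style half width is wider than the unadjusted $z_{q/2}\sigma$.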

5. Real Data Analysis 

In this section, I apply the different intervals to a microarray data set, the
Synteni data of Kerr, Martin, and Churchill (2000), which was revisited by Hsu et
al. (2006) and Qiu and Hwang (2007). The description of the data set can be
found in Kerr et al. (2000). Figure 6 of Qiu and Hwang (2007) is a Q-Q
plot of the ANOVA estimators $X_g$, which shows that the normal-mixture model
(1) fits the data well.

Hsu et al. (2006) used simultaneous confidence intervals to detect genes with
an expression level of A = 3 or more. I first apply Benjamini and Hochberg
(1995)'s procedure to select parameters with expression levels significantly greater
than or equal to $\log_2 3$, and then construct the simultaneous intervals for the
selected observations. B-H's procedure declares the first 89 genes
significant.

In Figure 5, I construct the confidence intervals for these 89 genes by using
Qiu and Hwang (2007)'s procedure, B-Y's procedure, and my procedure. My confidence
interval (12) for $\theta_{(g)}$ is
$0.93X_{(g)} \pm 0.96$. Compared with the interval $X_{(g)} \pm 1.47$ of B-Y's procedure and
$0.93X_{(g)} \pm 1.67$ of Qiu and Hwang (2007), my intervals are much shorter.

6. Discussion 

In this article, I have defined a new loss function for confidence interval
construction under the mixture prior model (1). I use two different ways
to choose the tuning parameter in the loss function to obtain Qiu and Hwang
(2007)'s procedure and the empirical Bayes FCR controlling intervals. I
conclude that the new empirical Bayes FCR controlling interval is better than
other existing procedures because of the sharp reduction in average
length.

However, there is still much need for further research. In model (1), I
assume an equal and known variance $\sigma^2$, which is generally not a practical
assumption. Hwang, Qiu, and Zhao (2008) proposed a double shrinkage empirical
Bayes interval for a single parameter without selection under the
normal-lognormal model. Therefore, one natural extension of this work is to consider
the mixture prior model when the variances are unequal and unknown. The loss
function (2) provides a potential tool to construct the corresponding intervals.
Supplementary Materials

A.1. Technical Details of Mathematical Results

Proof of Theorem 2.1

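Firstly,

\[ E(L(\theta_i, CI_i) \mid X) = k_1 \mathrm{Len}(CI_i) P(I_i = 1 \mid X) - \int I(\theta_i \in CI_i, I_i = 1)\, \pi(\theta_i \mid X)\, d\theta_i + I_{CI_i}(0)\,(k_2 - \pi_i^0(X)). \tag{A.1} \]

The integral $\int I(\theta_i \in CI_i, I_i = 1)\, \pi(\theta_i \mid X)\, d\theta_i$ can be written as
$\int_{CI_i} \pi(\theta_i, I_i = 1 \mid X)\, d\theta_i$, where $\pi(\theta_i, I_i = 1 \mid X) = \pi_i^1(X)\, \pi(\theta_i \mid I_i = 1, X)$.
Write $\mathrm{Len}(CI_i)$ as $\int_{CI_i} 1\, d\theta_i$. Then (A.1) equals

\[ \pi_i^1(X) \int_{CI_i} \big(k_1 - \pi(\theta_i \mid X, I_i = 1)\big)\, d\theta_i + I_{CI_i}(0)\,(k_2 - \pi_i^0(X)). \tag{A.2} \]

Now consider two intervals $CI_i^1$ and $CI_i^2$, where $CI_i^1 = \{\theta_i : k_1 < \pi(\theta_i \mid X, I_i = 1)\} \setminus \{0\}$
and $CI_i^2 = \{\theta_i : k_1 < \pi(\theta_i \mid X, I_i = 1)\} \cup \{0\}$. Then both $CI_i^1$ and $CI_i^2$
minimize the first term of the formula (A.2). Since $0 \notin CI_i^1$ and $0 \in CI_i^2$,

\[ E(L(\theta_i, CI_i^2) \mid X) = E(L(\theta_i, CI_i^1) \mid X) + (k_2 - \pi_i^0(X)). \]

Consequently, the Bayes interval includes 0 if and only if $k_2 \le \pi_i^0(X)$, i.e. it is
the one defined in (4).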
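Proof of Theorem 3.1

According to Zhao and Hwang (2008),

\[ FCR_\pi = E\Big(\frac{\sum_{i \in \mathcal{R}} P(\theta_i \notin CI_i^B \mid X)}{R}\, I(R > 0)\Big). \]

Since

\[ \begin{aligned} P(\theta_i \notin CI_i^B \mid X) &= P(\theta_i \notin CI_i^B \mid X, I_i = 0) P(I_i = 0 \mid X) + P(\theta_i \notin CI_i^B \mid X, I_i = 1) P(I_i = 1 \mid X) \\ &= \pi_i^0(X) I(\pi_i^0(X) < k_2) + (1 - \pi_i^0(X)) P(\theta_i \notin CI_i^B \mid X, I_i = 1), \end{aligned} \tag{A.3} \]

and $P(\theta_i \notin CI_i^B \mid X, I_i = 1) \le q$,

\[ FCR_\pi \le qE(I(R > 0)) + E\Big(\frac{\sum_{i \in \mathcal{R}} \pi_i^0(X)\big(I(\pi_i^0(X) < k_2) - q\big)}{R}\, I(R > 0)\Big) = qP(R > 0) + f(k_2). \]

The choice of $k_2$ ensures that $f(k_2) \le 0$. Consequently,

\[ FCR_\pi \le qP(R > 0). \]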

Proof of Theorem 4.1

Before the proof, I state and prove the following lemma.

Lemma A.1 Assume that $\hat\tau^2$ and $\hat\pi_0$ are consistent estimators of $\tau^2$ and $\pi_0$.
Then for any $\delta > 0$, there exists $P_0 > 0$ such that for all $p > P_0$,

\[ |\hat\pi_i^0 - \pi_i^0| < \delta, \quad \text{for all } i = 1, 2, \ldots, p. \]

Direct calculation shows that

\[ \pi_i^0 = \frac{\pi_0\sqrt{\sigma^2 + \tau^2}\exp(-\frac{MX_i^2}{2\sigma^2})}{\pi_0\sqrt{\sigma^2 + \tau^2}\exp(-\frac{MX_i^2}{2\sigma^2}) + \pi_1\sigma}, \]

and $\hat\pi_i^0$ has the
same form as $\pi_i^0$ except that $\pi_0$ and $\tau^2$ are replaced by their estimators $\hat\pi_0$ and
$\hat\tau^2$. Now, I introduce an intermediate estimator $\tilde\pi_i^0$ where $\pi_0$ is assumed known.
I shall prove that the lemma holds for $\tilde\pi_i^0$ first.

Since f is consistent, M = rr^. — j- is also a consistent estimator of M. Then, 
for e = i < min(i^(5, — W^^), there exists N, such that Vp > N, \M - 
M| < eM. 

Without loss of generality, assume that M > M, i.e. 0<M-M<eM=^. 
Since M is a increasing function with respect to when cr^ is fixed, therefore 
> f^. Direct calculation shows that 



~o 7ro^ia(^f^cxp(^^^)-l) 

< - < = ^ j—^ 17^3~ 

(tto Vcr^ + f2cxp( — + 7ria-)(7ro +7ri^7=^=5cxp(^^)) 



Since < M < M, 

< ' = - — ^ < 1. 

Consequently, 



t2 -u r2 1 _ M 



0-2 + r2 0-2 + t2 1 - M ' 

Therefore, (|A.4[) implies that 



(ttoVct^ + cxp(-^^^) + 7ricr)(7ro + tti exp(^^^)) 
Since the numerator is negative and the denominator is larger than ttottict, 

TTQTTia 1 — M 

Furthermore, M — M > —eM implies that 

n^~n^>^^{-e)>-S. (A.5) 

On the other hand, 

.o_ ^ 7ro7rig(exp(^^) - 1) _ TToVg^ + exp[^^) - 1 
I use C to denote the constant ^"'^^^J''^^ ', and let y = exp(^^^), then
exp(^^^) = y''. If Xi = 0, then y = 1, 

Otherwise, if Xi ^ 0, then y > 1, and 

Combine (jA.Sp and (|A.6p . then 

I*" - 7r°| < max{S, Ce) < S. (A.7) 

Now, assume that ttq is also estimated by ttq. Let A = —^^^^=exp{^^), 
then 



|7r?-^?| 



TTo TTo , I (tto - 7ro)A 



The denominator greater than ttottiA implies that Itt," — -ir"! < | \ ■ Since 
ttq is consistent for ttq, for any 5 > 0, there 3Po such that Vp > Pq, Ittq — ttqI < (5, 
then 

where D is a constant that only depends on ttq. Combining this with (|A.7p . one 
can get that 

Itt" - TT^\ < (1 + D)6, for alH = 1, 2, • • • ,p 

and completes the proof. 
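Proof of the theorem

According to Zhao and Hwang (2008),

\[ FCR_\pi = E\Big(\frac{\sum_{i \in \mathcal{R}} P(\theta_i \notin CI_i^{EB} \mid X)}{R}\, I(R > 0)\Big), \]

where $\mathcal{R}$ is the index set of the parameters that are selected and $R$ is the
number of selected parameters, i.e. $R = \#\mathcal{R}$. Similarly to formula (A.3) in the
proof of Theorem 3.1,

\[ P(\theta_i \notin CI_i^{EB} \mid X) = \pi_i^0(X) I(\hat\pi_i^0(X) < \hat k_2) + (1 - \pi_i^0(X)) P(\theta_i \notin CI_i^{EB} \mid X, I_i = 1). \]

In the empirical Bayes interval (10), there is a positive correction term
$-\hat M\sigma^2\log \hat M$. Dropping this term results in a shorter interval, which enlarges the
non-coverage probability, i.e.

\[ P(\theta_i \notin CI_i^{EB} \mid X, I_i = 1) \le P((\theta_i - \hat MX_i)^2 > \hat M\sigma^2 z_{q/2}^2 \mid X, I_i = 1). \]

Consequently,

\[ P(\theta_i \notin CI_i^{EB} \mid X) \le \pi_i^0(X) I(\hat\pi_i^0(X) < \hat k_2) + (1 - \pi_i^0(X)) P((\theta_i - \hat MX_i)^2 > \hat M\sigma^2 z_{q/2}^2 \mid X, I_i = 1). \]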
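Rearranging the terms in the above formula, one can bound the conditional
non-coverage probability $P(\theta_i \notin CI_i^{EB} \mid X)$ by

\[ \pi_i^0(X)\big(I(\hat\pi_i^0(X) < \hat k_2) - q\big) + \pi_i^0(X)\big(q - P((\theta_i - \hat MX_i)^2 > \hat M\sigma^2 z_{q/2}^2 \mid X, I_i = 1)\big) + P((\theta_i - \hat MX_i)^2 > \hat M\sigma^2 z_{q/2}^2 \mid X, I_i = 1). \]

Let

\[ A_1 = \frac{\sum_{i \in \mathcal{R}} \pi_i^0(X)\big(I(\hat\pi_i^0(X) < \hat k_2) - q\big)}{R}, \]

\[ A_2 = \frac{\sum_{i \in \mathcal{R}} \pi_i^0(X)\big(q - P((\theta_i - \hat MX_i)^2 > \hat M\sigma^2 z_{q/2}^2 \mid X, I_i = 1)\big)}{R}, \]

and

\[ A_3 = \frac{\sum_{i \in \mathcal{R}} P((\theta_i - \hat MX_i)^2 > \hat M\sigma^2 z_{q/2}^2 \mid X, I_i = 1)}{R}, \]

then $FCR_\pi$ is bounded from above by $E(A_1 + A_2 + A_3)$.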

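Since $\hat\pi_0$ and $\hat\tau^2$ are obtained by using the method of moments, the Delta method
implies that $\hat\pi_0 - \pi_0 = O_p(1/\sqrt{p})$ and $\hat\tau^2 - \tau^2 = O_p(1/\sqrt{p})$.

According to Lemma A.1, for any $\epsilon > 0$, I can always find a sufficiently large
$P_0$ such that for any $p > P_0$, $(\hat\tau^2 - \tau^2)^2 < \delta/3$ and $(\hat\pi_i^0(X) - \pi_i^0(X))^2 < \delta/3$.
Consequently,

\[ EA_1 \le E\Big(\frac{\sum_{i \in \mathcal{R}} \pi_i^0(X)\big(I(\pi_i^0(X) < \hat k_2 + \sqrt{\delta/3}) - q\big)}{R}\Big) = f(p, \tau^2, \pi_0, \hat k_2 + \sqrt{\delta/3}). \]

Since $(\hat\tau^2 - \tau^2)^2 + (\hat\pi_i^0(X) - \pi_i^0(X))^2 + (\sqrt{\delta/3})^2 \le \delta$, according to the
property (11) of the function $f$,

\[ f(p, \tau^2, \pi_0, \hat k_2 + \sqrt{\delta/3}) < f(p, \hat\tau^2, \hat\pi_0, \hat k_2) + \epsilon \le \epsilon, \]

because $\hat k_2$ is simulated as the maximum $k$ such that $f(p, \hat\tau^2, \hat\pi_0, k) \le 0$. Hence

\[ EA_1 < \epsilon. \tag{A.8} \]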

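For the second term $A_2$,

\[ |A_2| \le \frac{\sum_{i \in \mathcal{R}} \pi_i^0(X)\,\big|q - P((\theta_i - \hat MX_i)^2 > \hat M\sigma^2 z_{q/2}^2 \mid X, I_i = 1)\big|}{R} \le \frac{\sum_{i \in \mathcal{R}} \big|q - P((\theta_i - \hat MX_i)^2 > \hat M\sigma^2 z_{q/2}^2 \mid X, I_i = 1)\big|}{R}. \]

Taking a close look at the term $P((\theta_i - \hat MX_i)^2 > \hat M\sigma^2 z_{q/2}^2 \mid X, I_i = 1)$, one
knows that $(\theta_i \mid X_i, I_i = 1) \sim N(MX_i, M\sigma^2)$. Therefore one can replace $\theta_i$ by
$MX_i + \sqrt{M}\sigma Z$, where $Z$ is a standard normal random variable which is
independent of $X_i$. Consequently,

\[ P((\theta_i - \hat MX_i)^2 > \hat M\sigma^2 z_{q/2}^2 \mid X, I_i = 1) = P\Big(\Big|Z - \frac{(\hat M - M)X_i}{\sqrt{M}\sigma}\Big| > \sqrt{\frac{\hat M}{M}}\, z_{q/2}\Big). \tag{A.9} \]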
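Assume that $X_{(p)}$ is the observation that has the largest absolute value, then

\[ \Big|\frac{(\hat M - M)X_i}{\sqrt{M}\sigma}\Big| \le \frac{|(\hat M - M)X_{(p)}|}{\sqrt{M}\sigma}. \]

Consequently, for any $i = 1, 2, \ldots, p$, (A.9) falls into the range between

\[ P\Big(\Big|Z - \frac{|(\hat M - M)X_{(p)}|}{\sqrt{M}\sigma}\Big| > \sqrt{\frac{\hat M}{M}}\, z_{q/2}\Big) \quad \text{and} \quad P\Big(|Z| > \sqrt{\frac{\hat M}{M}}\, z_{q/2}\Big). \tag{A.10} \]

Let $X_i = \sqrt{\sigma^2 + \tau^2}\, Z_i$, then $Z_i \sim \pi_0 N(0, \frac{\sigma^2}{\sigma^2 + \tau^2}) + \pi_1 N(0, 1)$. Furthermore,

\[ \frac{(\hat M - M)\sqrt{\sigma^2 + \tau^2}}{\sqrt{M}\sigma} = \frac{\sigma(\hat\tau^2 - \tau^2)}{\tau(\hat\tau^2 + \sigma^2)}. \]

As a result, the range (A.10) can be rewritten as the range between

\[ P\Big(\Big|Z - \Big|\frac{\sigma(\hat\tau^2 - \tau^2)}{\tau(\hat\tau^2 + \sigma^2)} Z_{(p)}\Big|\Big| > \sqrt{\frac{\hat M}{M}}\, z_{q/2}\Big) \quad \text{and} \quad P\Big(|Z| > \sqrt{\frac{\hat M}{M}}\, z_{q/2}\Big). \tag{A.11} \]

Since the above range applies for all $i$, one knows that

\[ |A_2| \le \max\Big(\Big|q - P\Big(\Big|Z - \Big|\frac{\sigma(\hat\tau^2 - \tau^2)}{\tau(\hat\tau^2 + \sigma^2)} Z_{(p)}\Big|\Big| > \sqrt{\frac{\hat M}{M}}\, z_{q/2}\Big)\Big|,\ \Big|q - P\Big(|Z| > \sqrt{\frac{\hat M}{M}}\, z_{q/2}\Big)\Big|\Big). \tag{A.12} \]

Since $\hat\tau^2 - \tau^2 = O_p(1/\sqrt{p})$ and $Z_{(p)} = O_p(\sqrt{\log p})$,

\[ \frac{\sigma(\hat\tau^2 - \tau^2)}{\tau(\hat\tau^2 + \sigma^2)} Z_{(p)} = o_p(1). \tag{A.13} \]

The dominated convergence theorem implies that

\[ P\Big(\Big|Z - \Big|\frac{\sigma(\hat\tau^2 - \tau^2)}{\tau(\hat\tau^2 + \sigma^2)} Z_{(p)}\Big|\Big| > \sqrt{\frac{\hat M}{M}}\, z_{q/2}\Big) \to P(|Z| > z_{q/2}) = q, \]

and

\[ P\Big(|Z| > \sqrt{\frac{\hat M}{M}}\, z_{q/2}\Big) \to P(|Z| > z_{q/2}) = q. \]

Applying the dominated convergence theorem again, one can deduce from (A.12)
that

\[ \limsup_{p \to \infty} E|A_2| \le 0. \tag{A.14} \]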



p — >oo 



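Similar arguments apply to $A_3$ and one can show that

\[ A_3 \le P\Big(\Big|Z - \frac{|(\hat M - M)X_{(p)}|}{\sqrt{M}\sigma}\Big| > \sqrt{\frac{\hat M}{M}}\, z_{q/2}\Big) = P\Big(\Big|Z - \Big|\frac{\sigma(\hat\tau^2 - \tau^2)}{\tau(\hat\tau^2 + \sigma^2)} Z_{(p)}\Big|\Big| > \sqrt{\frac{\hat M}{M}}\, z_{q/2}\Big). \]

The dominated convergence theorem and (A.13) imply that

\[ \limsup_{p \to \infty} E|A_3| \le \lim_{p \to \infty} EP(|Z| > z_{q/2}) = q. \tag{A.15} \]

(A.8), (A.14), and (A.15) imply that

\[ \limsup_{p \to \infty} FCR_\pi \le q. \]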

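Proof of Proposition 4.1

Assume that $X_i \sim \pi_0 N(0, \sigma^2) + (1 - \pi_0) N(0, \tau^2 + \sigma^2)$ and
$Y_i \sim \pi_0' N(0, \sigma^2) + (1 - \pi_0') N(0, \tau'^2 + \sigma^2)$, where $i = 1, 2, \ldots, p$. Let $\pi_i'^0(Y)$
denote the posterior null probability under the second model. Then

\[ \begin{aligned} |f(p, \tau^2, \pi_0, k) - f(p, \tau'^2, \pi_0', k')| &= \Big|E\,\frac{\sum_{i=1}^{R} \big[\pi_i^0(X)(I(\pi_i^0(X) < k) - q) - \pi_i'^0(Y)(I(\pi_i'^0(Y) < k') - q)\big]}{R}\Big| \\ &= \Big|E\,\frac{q\sum_{i=1}^{R}(\pi_i'^0(Y) - \pi_i^0(X)) + \sum_{i=1}^{R}\big(\pi_i^0(X)I(\pi_i^0(X) < k) - \pi_i'^0(Y)I(\pi_i'^0(Y) < k')\big)}{R}\Big|. \end{aligned} \]

Since $R$ goes to $\infty$ as $p \to \infty$, by the law of large numbers, the
function inside the above expectation converges in probability to
$A = qE(\pi_1'^0(Y) - \pi_1^0(X)) + E\big(\pi_1^0(X)I(\pi_1^0(X) < k) - \pi_1'^0(Y)I(\pi_1'^0(Y) < k')\big)$.
Since the integrand is a bounded function, it is sufficient to show that for every $\epsilon > 0$, there exists $\delta$ such
that $(k' - k)^2 + (\tau'^2 - \tau^2)^2 + (\pi_0' - \pi_0)^2 < \delta$ implies that $|A| < \epsilon$.

In fact, $E\pi_1^0(X) = E(P(\theta_1 = 0 \mid X)) = P(\theta_1 = 0) = \pi_0$. This implies that

\[ qE(\pi_1'^0(Y) - \pi_1^0(X)) = q(\pi_0' - \pi_0). \tag{A.16} \]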

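Furthermore, direct calculation shows that

\[
E(\pi_1^0(X) I(\pi_1^0(X) \le k)) = \int_{\{\pi_1^0(X) \le k\}} \pi_1^0(X)\, m(X)\, dX = \pi_0 \int_{\{\pi_1^0(x) \le k\}} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{x^2}{2\sigma^2}\right) dx.
\]

Since $\{\pi_1^0(X) \le k\}$ implies that $X^2 \ge \frac{2\sigma^2}{M}\left(\log\frac{1-k}{k} + \log\frac{\pi_0}{(1-\pi_0)\sqrt{1-M}}\right)$,

\[
\begin{aligned}
& E(\pi_1^0(X) I(\pi_1^0(X) \le k) - \pi_1^{\prime 0}(Y) I(\pi_1^{\prime 0}(Y) \le k')) \\
&\quad = \pi_0 P\left( N^2 \ge \frac{2}{M}\left(\log\frac{1-k}{k} + \log\frac{\pi_0}{(1-\pi_0)\sqrt{1-M}}\right) \right)
 - \pi_0' P\left( N^2 \ge \frac{2}{M'}\left(\log\frac{1-k'}{k'} + \log\frac{\pi_0'}{(1-\pi_0')\sqrt{1-M'}}\right) \right),
\end{aligned}
\]

where $N$ is a standard normal random variable. When $k, k'$ are close to 1, $\log\frac{1-k}{k} \to -\infty$; therefore $|E(\pi_1^0(X) I(\pi_1^0(X) \le k) - \pi_1^{\prime 0}(Y) I(\pi_1^{\prime 0}(Y) \le k'))| = |\pi_0 - \pi_0'|$ if $k, k' > \epsilon_1$, where $\epsilon_1 < 1$ is sufficiently close to 1. Similarly, if $k, k'$ are close to 0, then $\log\frac{1-k}{k} \to \infty$. I can choose a sufficiently small $\epsilon_0$ such that, when $k, k' < \epsilon_0$, both probabilities above are arbitrarily small. Consequently, $|E(\pi_1^0(X) I(\pi_1^0(X) \le k) - \pi_1^{\prime 0}(Y) I(\pi_1^{\prime 0}(Y) \le k'))| < \epsilon$ when $k, k'$ are either close to 0 or 1.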

Furthermore, assume that $0 < \epsilon_0 < k, k' < \epsilon_1 < 1$. Then, by the continuity of $E(\pi_1^0(X) I(\pi_1^0(X) \le k) - \pi_1^{\prime 0}(Y) I(\pi_1^{\prime 0}(Y) \le k'))$, there exists a small $\delta < \epsilon$ such that $(k' - k)^2 + (\tau'^2 - \tau^2)^2 + (\pi_0' - \pi_0)^2 < \delta$ implies that $|E(\pi_1^0(X) I(\pi_1^0(X) \le k) - \pi_1^{\prime 0}(Y) I(\pi_1^{\prime 0}(Y) \le k'))| < \epsilon$. Combining this with (A.16), one obtains that $|A| < \epsilon$ when $\delta$ is sufficiently small, which completes the proof.
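The threshold form of the event $\{\pi_1^0(X) \le k\}$ and the resulting closed form for $E(\pi_1^0(X) I(\pi_1^0(X) \le k))$ can be checked numerically. The following is a minimal sketch, assuming $M = \tau^2/(\tau^2+\sigma^2)$ and the two-group model stated at the start of the proof. The function names and the parameter values in the usage note are hypothetical and chosen only for illustration.

```python
import math


def normal_pdf(x, var):
    """Density of N(0, var) at x."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def local_fdr(x, pi0, sigma2, tau2):
    """Posterior null probability P(theta = 0 | X = x) in the two-group model."""
    null = pi0 * normal_pdf(x, sigma2)
    alt = (1.0 - pi0) * normal_pdf(x, sigma2 + tau2)
    return null / (null + alt)


def threshold_sq(k, pi0, sigma2, tau2):
    """Value c such that local_fdr(x) <= k exactly when x^2 >= c (assumes M = tau2/(tau2+sigma2))."""
    M = tau2 / (tau2 + sigma2)
    log_term = math.log((1.0 - k) / k) + math.log(pi0 / ((1.0 - pi0) * math.sqrt(1.0 - M)))
    return 2.0 * sigma2 / M * log_term


def closed_form(k, pi0, sigma2, tau2):
    """pi0 * P(sigma^2 N^2 >= c) for standard normal N; the probability is 1 when c <= 0."""
    c = threshold_sq(k, pi0, sigma2, tau2)
    if c <= 0.0:
        return pi0
    # P(|N| > t) = erfc(t / sqrt(2)) for a standard normal N.
    return pi0 * math.erfc(math.sqrt(c / sigma2) / math.sqrt(2.0))


def numeric_expectation(k, pi0, sigma2, tau2, half_width=12.0, n=100_001):
    """Riemann-sum approximation of E[local_fdr(X) 1{local_fdr(X) <= k}] under the marginal of X."""
    h = 2.0 * half_width / (n - 1)
    total = 0.0
    for j in range(n):
        x = -half_width + j * h
        marginal = pi0 * normal_pdf(x, sigma2) + (1.0 - pi0) * normal_pdf(x, sigma2 + tau2)
        p0 = local_fdr(x, pi0, sigma2, tau2)
        if p0 <= k:
            total += p0 * marginal * h
    return total
```

For example, with the hypothetical values `k=0.2, pi0=0.8, sigma2=1.0, tau2=4.0`, `numeric_expectation` and `closed_form` should agree closely, and `closed_form` returns `pi0` itself once `k` is close enough to 1 that the threshold becomes non-positive, which matches the limiting case used in the proof.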

A.2. Figures and Graphs
[Figure: panels "FCR: p=1000, pi0=0.3, q=0.05" and "FCR: p=1000, pi0=0.5, q=0.05"; horizontal axis M.]

Fig 1. These figures show the simulated Bayes FCR under different model settings against $M = \tau^2/(\tau^2 + \sigma^2)$. The dimension is set to 1000, and the top 100 observations, after ordering all $X_i$'s according to their magnitude, are selected for confidence interval construction. The hyperparameter $\pi_0$ varies among 0.3, 0.5, 0.8, and 0.9. The Bayes FCR level that I aim at is 5%. When $M$ is small, (10) does not control the Bayes FCR at 5%; however, the mixed procedure (12) does control the Bayes FCR for any hyperparameters. The portions of the mixture increase as $\pi_0$ increases.
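The selection step in this simulation setting can be sketched in a few lines. The code below is only an illustration of the data-generating model and the top-100 selection rule. It does not implement procedures (10) or (12), and it evaluates the unadjusted interval $X_i \pm z_{q/2}\sigma$ purely to show how strongly selection alone inflates the false coverage proportion. The function names, the choice $\tau = \sigma = 1$, and the seed are assumptions.

```python
import math
import random


def simulate_selected(p=1000, n_select=100, pi0=0.8, sigma=1.0, tau=1.0, seed=0):
    """Draw (X_i, theta_i) from the mixture model and keep the n_select largest |X_i|."""
    rng = random.Random(seed)
    # theta_i = 0 with probability pi0, otherwise theta_i ~ N(0, tau^2).
    theta = [0.0 if rng.random() < pi0 else rng.gauss(0.0, tau) for _ in range(p)]
    # X_i | theta_i ~ N(theta_i, sigma^2).
    x = [t + rng.gauss(0.0, sigma) for t in theta]
    order = sorted(range(p), key=lambda i: abs(x[i]), reverse=True)
    return [(x[i], theta[i]) for i in order[:n_select]]


def unadjusted_false_coverage(selected, sigma=1.0, z=1.959963984540054):
    """Fraction of selected parameters missed by the unadjusted interval X_i +/- z * sigma."""
    misses = sum(1 for xi, ti in selected if abs(xi - ti) > z * sigma)
    return misses / len(selected)
```

Calling `unadjusted_false_coverage(simulate_selected())` gives one realization of the false coverage proportion. Averaging over many seeds would approximate the corresponding FCR of the unadjusted interval, which is the quantity the procedures compared in the figure are designed to keep at or below $q$.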
Fig 2. These figures show the simulated average lengths of different approaches under the same model settings as Figure 1. The average length of my procedure is less than or equal to that of B-Y's procedure. In some extreme cases, the average length of (12) is only 54% of that of B-Y's procedure.
Fig 3. These figures show the simulated Bayes FCR under different model settings against $M = \tau^2/(\tau^2 + \sigma^2)$. The dimension is set to 1000. The selection rule is based on the procedure of Benjamini and Hochberg (1995), which aims at controlling the False Discovery Rate at or below 5%. The hyperparameter $\pi_0$ varies among 0.3, 0.5, 0.8, and 0.9. The Bayes FCR level that I aim at is 5%, which is represented by the magenta line. When $M$ is small, (10) does not control the Bayes FCR; however, the FCRs of the mixed procedure (12) and B-Y's procedure are always less than or equal to the error bar, which equals $q$ plus the simulation error.
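The selection rule used in this setting is the standard Benjamini and Hochberg step-up procedure. A minimal sketch follows, assuming two-sided p-values computed from $X_i \sim N(0, \sigma^2)$ under the null. The function names are hypothetical, and the simulation code behind the figure may differ in its details.

```python
import math


def two_sided_p_values(xs, sigma=1.0):
    """Two-sided p-values for H0: theta_i = 0 when X_i ~ N(theta_i, sigma^2)."""
    # P(|N| > t) = erfc(t / sqrt(2)) for a standard normal N.
    return [math.erfc(abs(x) / (sigma * math.sqrt(2.0))) for x in xs]


def benjamini_hochberg(p_values, q=0.05):
    """Return the sorted indices rejected by the BH step-up procedure at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    # Step-up: find the largest rank j with p_(j) <= j * q / m.
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            cutoff = rank
    return sorted(order[:cutoff])
```

The indices returned by `benjamini_hochberg(two_sided_p_values(xs))` identify the selected populations, for which the intervals are then constructed. Unlike the top-100 rule, the number of selected populations here is random.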
[Figure: panels "leng: p=1000, pi0=0.3, q=0.05" and "leng: p=1000, pi0=0.5, q=0.05"; horizontal axis M.]

Fig 4. These figures show the simulated average lengths of different approaches under the same model as Figure 3. The average length of my procedure is less than that of B-Y's procedure. In some extreme cases, the average length of (12) is only 44% of that of B-Y's procedure.
Fig 5. Three different interval approaches, Qiu and Hwang (2007), B-Y (2005), and (12), are applied to the Synteni data of Kerr, Martin, and Churchill (2000). B-H (1995)'s FDR procedure, which aims at finding the genes whose differential expression levels are significantly larger than or equal to $\log_2 3$ while controlling the False Discovery Rate at or below 5%, is applied to select genes for interval estimation. Among 1285 genes, 89 are declared significant, and the corresponding intervals are constructed and plotted in this figure. From the figure, one can see that the centers of the intervals of Qiu and Hwang (2007) are the same as those of (12). However, since they aim at controlling the simultaneous coverage coefficient by using Bonferroni's correction, their intervals are much longer than those of (12). B-Y (2005) center their intervals at the biased estimators $X_{(i)}$. Thus they end up correcting the selection bias by increasing the length, and their intervals turn out to be much longer than those of (12). However, B-Y's intervals are slightly shorter than those of Qiu and Hwang (2007).